# v0 Symbol Format

The v0 mangling format was introduced in [RFC 2603].
It has the following properties:

- It provides an unambiguous string encoding for everything that can end up in a binary's symbol table.
- It encodes information about generic parameters in a reversible way.
- The mangled symbols are *decodable* such that the demangled form should be easily identifiable as some concrete instance of e.g. a polymorphic function.
- It has a consistent definition that does not rely on pretty-printing certain language constructs.
- Symbols can be restricted to only consist of the characters `A-Z`, `a-z`, `0-9`, and `_`.
  This helps ensure that it is platform-independent,
  where other characters might have special meaning in some context (e.g. `.` for MSVC `DEF` files).
  Unicode symbols are optionally supported.
- It tries to stay efficient, avoiding unnecessarily long names,
  and avoiding computationally expensive operations to demangle.

The v0 format is not intended to be compatible with other mangling schemes (such as C++).

The v0 format is not presented as a stable ABI for Rust.
This format is currently intended to be well-defined enough that a demangler can produce a reasonable human-readable form of the symbol.
Several portions are implementation-defined, so it is not possible to entirely predict how a given Rust entity will be encoded.

The sections below define the encoding of a v0 symbol.
There is no standardized demangled form of the symbols,
though suggestions are provided for how to demangle a symbol.
Implementers may choose to demangle in different ways.

## Extensions

This format may be extended in the future to add new tags as Rust is extended with new language items.
To be forward compatible, demanglers should gracefully handle symbols in which they encounter a tag character not described in this document.
For example, they may fall back to displaying the mangled symbol.
The format may be extended anywhere there is a tag character, such as the [type] rule.
The meaning of existing tags and encodings will not be changed.

## Grammar notation

The format of an encoded symbol is illustrated as a context-free grammar in an extended BNF-like syntax.
A consolidated summary can be found in the [Symbol grammar summary][summary].

| Name | Syntax | Example | Description |
|------|--------|---------|-------------|
| Rule | →      | <nobr>A → *B* *C*</nobr> | A production. |
| Concatenation | whitespace | <nobr>A → *B* *C* *D*</nobr> | Individual elements in sequence left-to-right. |
| Alternative | \| | <nobr>A → *B* \| *C*</nobr> | Matches either one or the other. |
| Grouping | () | <nobr>A → *B* (*C* \| *D*) *E*</nobr> | Groups multiple elements as one. |
| Repetition | {} | <nobr>A → {*B*}</nobr> | Repeats the enclosed zero or more times. |
| Option | <sub>opt</sub> | <nobr>A → *B*<sub>opt</sub> *C*</nobr> | An optional element. |
| Literal | `monospace` | <nobr>A → `G`</nobr> | A terminal matching the exact characters, case-sensitively. |

## Symbol name
[symbol-name]: #symbol-name

> symbol-name → `_R` *[decimal-number]*<sub>opt</sub> *[path]* *[instantiating-crate]*<sub>opt</sub> *[vendor-specific-suffix]*<sub>opt</sub>

A mangled symbol starts with the two characters `_R` which is a prefix to identify the symbol as a Rust symbol.
The prefix can optionally be followed by a *[decimal-number]* which specifies the encoding version.
This number is currently not used, and is never present in the current encoding.
Following that is a *[path]* which encodes the path to an entity.
The path is followed by an optional *[instantiating-crate]* which helps to disambiguate entities which may be instantiated multiple times in separate crates.
The final part is an optional *[vendor-specific-suffix]*.

> **Recommended Demangling**
>
> A *symbol-name* should be displayed as the *[path]*.
> The *[instantiating-crate]* and the *[vendor-specific-suffix]* usually need not be displayed.

> Example:
> ```rust
> std::path::PathBuf::new();
> ```
>
> The symbol for `PathBuf::new` in crate `mycrate` is:
>
> ```text
> _RNvMsr_NtCs3ssYzQotkvD_3std4pathNtB5_7PathBuf3newCs15kBYyAo9fc_7mycrate
> ├┘└───────────────────────┬──────────────────────┘└──────────┬─────────┘
> │                         │                                  │
> │                         │                                  └── instantiating-crate path "mycrate"
> │                         └───────────────────────────────────── path to std::path::PathBuf::new
> └─────────────────────────────────────────────────────────────── `_R` symbol prefix
> ```
>
> Recommended demangling: `<std::path::PathBuf>::new`
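As a rough illustration of the prefix handling described above, the following sketch strips the `_R` prefix and an optional encoding version. The function name `split_symbol_prefix` is hypothetical, and the sketch assumes that a *[path]* always begins with an uppercase tag character, so any digits directly after `_R` can only belong to the *[decimal-number]*.

```rust
/// Splits a mangled name into its optional encoding version and the remainder
/// (the path and anything that follows it).
/// Returns `None` if the name does not start with the `_R` prefix.
fn split_symbol_prefix(symbol: &str) -> Option<(Option<&str>, &str)> {
    let rest = symbol.strip_prefix("_R")?;
    // Assumption: path tags are uppercase letters, so leading digits are the version.
    let digits = rest.bytes().take_while(u8::is_ascii_digit).count();
    let version = if digits == 0 { None } else { Some(&rest[..digits]) };
    Some((version, &rest[digits..]))
}
```

A demangler built on this would likely treat any `Some(version)` as unsupported, since the version number is never present in the current encoding.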

## Symbol path
[path]: #symbol-path

> path → \
> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *[crate-root]* \
> &nbsp;&nbsp; | *[inherent-impl]* \
> &nbsp;&nbsp; | *[trait-impl]* \
> &nbsp;&nbsp; | *[trait-definition]* \
> &nbsp;&nbsp; | *[nested-path]* \
> &nbsp;&nbsp; | *[generic-args]* \
> &nbsp;&nbsp; | *[backref]*

A *path* represents a variant of a [Rust path][reference-paths] to some entity.
In addition to typical Rust path segments using identifiers,
it uses extra elements to represent unnameable entities (like an `impl`) or generic arguments for monomorphized items.

The initial tag character can be used to determine which kind of path it represents:

| Tag | Rule | Description |
|-----|------|-------------|
| `C` | *[crate-root]* | The root of a crate path. |
| `M` | *[inherent-impl]* | An inherent implementation. |
| `X` | *[trait-impl]* | A trait implementation. |
| `Y` | *[trait-definition]* | A trait definition. |
| `N` | *[nested-path]* | A nested path. |
| `I` | *[generic-args]* | Generic arguments. |
| `B` | *[backref]* | A back reference. |
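The tag table above maps naturally onto a small dispatch function. The sketch below is only illustrative: `PathKind`, `path_kind`, and `describe_path` are hypothetical names. It also shows one possible way to follow the forward-compatibility advice from [Extensions](#extensions), by falling back to the mangled text when the tag is unknown.

```rust
#[derive(Debug, PartialEq)]
enum PathKind {
    CrateRoot,
    InherentImpl,
    TraitImpl,
    TraitDefinition,
    NestedPath,
    GenericArgs,
    Backref,
}

/// Classifies a path by its initial tag character.
/// `None` means the tag is not one of those listed in the table.
fn path_kind(path: &str) -> Option<PathKind> {
    match path.bytes().next()? {
        b'C' => Some(PathKind::CrateRoot),
        b'M' => Some(PathKind::InherentImpl),
        b'X' => Some(PathKind::TraitImpl),
        b'Y' => Some(PathKind::TraitDefinition),
        b'N' => Some(PathKind::NestedPath),
        b'I' => Some(PathKind::GenericArgs),
        b'B' => Some(PathKind::Backref),
        _ => None,
    }
}

/// Names the kind of path, or returns the mangled text unchanged
/// when the tag is unknown (for example, one added by a future extension).
fn describe_path(path: &str) -> String {
    match path_kind(path) {
        Some(kind) => format!("{:?}", kind),
        None => path.to_string(),
    }
}
```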

### Path: Crate root
[crate-root]: #path-crate-root

> crate-root → `C` *[identifier]*

A *crate-root* indicates a path referring to the root of a crate's module tree.
It consists of the character `C` followed by the crate name as an *[identifier]*.

The crate name is the name as seen from the defining crate.
Since Rust supports linking multiple crates with the same name,
the *[disambiguator]* is used to make the name unique across the crate graph.

> **Recommended Demangling**
>
> A *crate-root* can be displayed as the identifier such as `mycrate`.
>
> Usually the disambiguator in the identifier need not be displayed,
> but as an alternate form the disambiguator can be shown in hex such as
> `mycrate[ca63f166dbe9294]`.

> Example:
> ```rust
> fn example() {}
> ```
>
> The symbol for `example` in crate `mycrate` is:
>
> ```text
> _RNvCs15kBYyAo9fc_7mycrate7example
>     │└────┬─────┘││└──┬──┘
>     │     │      ││   │
>     │     │      ││   └── crate-root identifier "mycrate"
>     │     │      │└────── length 7 of "mycrate"
>     │     │      └─────── end of base-62-number
>     │     └────────────── disambiguator for crate-root "mycrate" 0xca63f166dbe9293 + 1
>     └──────────────────── crate-root
> ```
>
> Recommended demangling: `mycrate::example`

### Path: Inherent impl
[inherent-impl]: #path-inherent-impl

> inherent-impl → `M` *[impl-path]* *[type]*

An *inherent-impl* indicates a path to an [inherent implementation][reference-inherent-impl].
It consists of the character `M` followed by an *[impl-path]*, which uniquely identifies the impl block the item is defined in.
Following that is a *[type]* representing the `Self` type of the impl.

> **Recommended Demangling**
>
> An *inherent-impl* can be displayed as a qualified path segment to the *[type]* within angled brackets.
> The *[impl-path]* usually need not be displayed.

> Example:
> ```rust
> struct Example;
> impl Example {
>     fn foo() {}
> }
> ```
>
> The symbol for `foo` in the impl for `Example` is:
>
> ```text
> _RNvMs_Cs4Cv8Wi1oAIB_7mycrateNtB4_7Example3foo
>     │├┘└─────────┬──────────┘└────┬──────┘
>     ││           │                │
>     ││           │                └── Self type "Example"
>     ││           └─────────────────── path to the impl's parent "mycrate"
>     │└─────────────────────────────── disambiguator 1
>     └──────────────────────────────── inherent-impl
> ```
>
> Recommended demangling: `<mycrate::Example>::foo`

### Path: Trait impl
[trait-impl]: #path-trait-impl

> trait-impl → `X` *[impl-path]* *[type]* *[path]*

A *trait-impl* indicates a path to a [trait implementation][reference-trait-impl].
It consists of the character `X` followed by an *[impl-path]* to the impl's parent followed by the *[type]* representing the `Self` type of the impl followed by a *[path]* to the trait.

> **Recommended Demangling**
>
> A *trait-impl* can be displayed as a qualified path segment using the `<` *type* `as` *path* `>` syntax.
> The *[impl-path]* usually need not be displayed.

> Example:
> ```rust
> struct Example;
> trait Trait {
>     fn foo();
> }
> impl Trait for Example {
>     fn foo() {}
> }
> ```
>
> The symbol for `foo` in the trait impl for `Example` is:
>
> ```text
> _RNvXCs15kBYyAo9fc_7mycrateNtB2_7ExampleNtB2_5Trait3foo
>     │└─────────┬──────────┘└─────┬─────┘└────┬────┘
>     │          │                 │           │
>     │          │                 │           └── path to the trait "Trait"
>     │          │                 └────────────── Self type "Example"
>     │          └──────────────────────────────── path to the impl's parent "mycrate"
>     └─────────────────────────────────────────── trait-impl
> ```
>
> Recommended demangling: `<mycrate::Example as mycrate::Trait>::foo`

### Path: Impl
[impl-path]: #path-impl

> impl-path → *[disambiguator]*<sub>opt</sub> *[path]*

An *impl-path* is a path used for *[inherent-impl]* and *[trait-impl]* to indicate the path to the parent of an [implementation][reference-implementations].
It consists of an optional *[disambiguator]* followed by a *[path]*.
The *[path]* is the path to the parent that contains the impl.
The *[disambiguator]* can be used to distinguish between multiple impls within the same parent.

> **Recommended Demangling**
>
> An *impl-path* usually need not be displayed (unless the location of the impl is desired).

> Example:
> ```rust
> struct Example;
> impl Example {
>     fn foo() {}
> }
> impl Example {
>     fn bar() {}
> }
> ```
>
> The symbol for `foo` in the impl for `Example` is:
>
> ```text
> _RNvMCs7qp2U7fqm6G_7mycrateNtB2_7Example3foo
>      └─────────┬──────────┘
>                │
>                └── path to the impl's parent crate-root "mycrate"
> ```
>
> The symbol for `bar` is similar, though it has a disambiguator to indicate it is in a different impl block.
>
> ```text
> _RNvMs_Cs7qp2U7fqm6G_7mycrateNtB4_7Example3bar
>      ├┘└─────────┬──────────┘
>      │           │
>      │           └── path to the impl's parent crate-root "mycrate"
>      └────────────── disambiguator 1
> ```
>
> Recommended demangling:
> * `foo`: `<mycrate::Example>::foo`
> * `bar`: `<mycrate::Example>::bar`

### Path: Trait definition
[trait-definition]: #path-trait-definition

> trait-definition → `Y` *[type]* *[path]*

A *trait-definition* is a path to a [trait definition][reference-traits].
It consists of the character `Y` followed by the *[type]* which is the `Self` type of the referrer, followed by the *[path]* to the trait definition.

> **Recommended Demangling**
>
> A *trait-definition* can be displayed as a qualified path segment using the `<` *type* `as` *path* `>` syntax.

> Example:
> ```rust
> trait Trait {
>     fn example() {}
> }
> struct Example;
> impl Trait for Example {}
> ```
>
> The symbol for `example` in the trait `Trait` implemented for `Example` is:
>
> ```text
> _RNvYNtCs15kBYyAo9fc_7mycrate7ExampleNtB4_5Trait7exampleB4_
>     │└──────────────┬───────────────┘└────┬────┘
>     │               │                     │
>     │               │                     └── path to the trait "Trait"
>     │               └──────────────────────── path to the implementing type "mycrate::Example"
>     └──────────────────────────────────────── trait-definition
> ```
>
> Recommended demangling: `<mycrate::Example as mycrate::Trait>::example`

### Path: Nested path
[nested-path]: #path-nested-path

> nested-path → `N` *[namespace]* *[path]* *[identifier]*

A *nested-path* is a path representing an optionally named entity.
It consists of the character `N` followed by a *[namespace]* indicating the namespace of the entity,
followed by a *[path]* which is a path representing the parent of the entity,
followed by an *[identifier]* of the entity.

The identifier of the entity may have a length of 0 when the entity is not named.
For example, entities like closures, tuple-like struct constructors, and anonymous constants may not have a name.
The identifier may still have a disambiguator unless the disambiguator is 0.

> **Recommended Demangling**
>
> A *nested-path* can be displayed by first displaying the *[path]* followed by a `::` separator followed by the *[identifier]*.
> If the *[identifier]* is empty, then the separating `::` should not be displayed.
>
> If a *[namespace]* is specified, then extra context may be added such as: \
> *[path]* `::{` *[namespace]* (`:` *[identifier]*)<sub>opt</sub> `#` *disambiguator*<sub>as base-10 number</sub> `}`
>
> Here the namespace `C` may be printed as `closure` and `S` as `shim`.
> Others may be printed by their character tag.
> The `:` *name* portion may be skipped if the name is empty.
>
> The *[disambiguator]* in the *[identifier]* may be displayed if a *[namespace]* is specified.
> In other situations, it is usually not necessary to display the *[disambiguator]*.
> If it is displayed, it is recommended to place it in brackets, for example `[284a76a8b41a7fd3]`.
> If the *[disambiguator]* is not present, then its value is 0 and it can always be omitted from display.

> Example:
> ```rust
> fn main() {
>     let x = || {};
>     let y = || {};
>     x();
>     y();
> }
> ```
>
> The symbol for the closure `x` in crate `mycrate` is:
>
> ```text
> _RNCNvCsgStHSCytQ6I_7mycrate4main0B3_
>   ││└─────────────┬─────────────┘│
>   ││              │              │
>   ││              │              └── identifier with length 0
>   ││              └───────────────── path to "mycrate::main"
>   │└──────────────────────────────── closure namespace
>   └───────────────────────────────── nested-path
> ```
>
> The symbol for the closure `y` is similar, with a disambiguator:
>
> ```text
> _RNCNvCsgStHSCytQ6I_7mycrate4mains_0B3_
>                                  ││
>                                  │└── base-62-number 0
>                                  └─── disambiguator 1 (base-62-number+1)
> ```
>
> Recommended demangling:
> * `x`: `mycrate::main::{closure#0}`
> * `y`: `mycrate::main::{closure#1}`
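The display recommendation for a *nested-path* can be sketched as follows. This is one possible reading, not a required output format: `display_nested` is a hypothetical helper that takes the already-demangled parent path, the namespace character, the identifier text, and the decoded disambiguator. Lowercase namespaces are printed as a plain `::`-separated segment, on the assumption that they should not be shown.

```rust
/// Renders a nested path from its already-decoded parts.
fn display_nested(parent: &str, namespace: char, name: &str, disambiguator: u64) -> String {
    let label = match namespace {
        'C' => "closure".to_string(),
        'S' => "shim".to_string(),
        // Other uppercase namespaces are printed by their tag character.
        ns if ns.is_ascii_uppercase() => ns.to_string(),
        // Lowercase namespaces: plain segment, no `::` for an empty identifier.
        _ => {
            return if name.is_empty() {
                parent.to_string()
            } else {
                format!("{}::{}", parent, name)
            };
        }
    };
    if name.is_empty() {
        format!("{}::{{{}#{}}}", parent, label, disambiguator)
    } else {
        format!("{}::{{{}:{}#{}}}", parent, label, name, disambiguator)
    }
}
```

With the closure example above, `display_nested("mycrate::main", 'C', "", 1)` corresponds to the closure `y`.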

### Path: Generic arguments
[generic-args]: #path-generic-arguments
[generic-arg]: #path-generic-arguments

> generic-args → `I` *[path]* {*[generic-arg]*} `E`
>
> generic-arg → \
> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *[lifetime]* \
> &nbsp;&nbsp; | *[type]* \
> &nbsp;&nbsp; | `K` *[const]*

A *generic-args* is a path representing a list of generic arguments.
It consists of the character `I` followed by a *[path]* to the defining entity, followed by zero or more <em>[generic-arg]</em>s terminated by the character `E`.

Each *[generic-arg]* is either a *[lifetime]* (starting with the character `L`), a *[type]*, or the character `K` followed by a *[const]* representing a const argument.

> **Recommended Demangling**
>
> A *generic-args* may be printed as: *[path]* `::`<sub>opt</sub> `<` comma-separated list of args `>`
> The `::` separator may be elided for type paths (similar to Rust's rules).

> Example:
> ```rust
> fn main() {
>     example([123]);
> }
>
> fn example<T, const N: usize>(x: [T; N]) {}
> ```
>
> The symbol for the function `example` is:
>
> ```text
> _RINvCsgStHSCytQ6I_7mycrate7examplelKj1_EB2_
>   │└──────────────┬───────────────┘││││││
>   │               │                │││││└── end of generic-args
>   │               │                ││││└─── end of const-data
>   │               │                │││└──── const value `1`
>   │               │                ││└───── const type `usize`
>   │               │                │└────── const generic
>   │               │                └─────── generic type i32
>   │               └──────────────────────── path to "mycrate::example"
>   └──────────────────────────────────────── generic-args
> ```
>
> Recommended demangling: `mycrate::example::<i32, 1>`

### Namespace
[namespace]: #namespace

> namespace → *[lower]* | *[upper]*

A *namespace* is used to segregate names into separate logical groups, so that otherwise identical names do not collide.
It consists of a single uppercase or lowercase ASCII letter.
Lowercase letters are reserved for implementation-internal disambiguation categories (and demanglers should never show them).
Uppercase letters are used for special namespaces which demanglers may display in a special way.

Uppercase namespaces are:

* `C` — A closure.
* `S` — A shim. Shims are added by the compiler in some situations where an intermediate is needed.
  For example, a `fn()` pointer to a function with the [`#[track_caller]` attribute][reference-track_caller] needs a shim to deal with the implicit caller location.

> **Recommended Demangling**
>
> See *[nested-path]* for recommended demangling.

## Identifier
[identifier]: #identifier
[undisambiguated-identifier]: #identifier
[bytes]: #identifier

> identifier → *[disambiguator]*<sub>opt</sub> *[undisambiguated-identifier]*
>
> undisambiguated-identifier → `u`<sub>opt</sub> *[decimal-number]* `_`<sub>opt</sub> *[bytes]*
>
> bytes → {*UTF-8 bytes*}

An *identifier* is a named label used in a *[path]* to refer to an entity.
It consists of an optional *[disambiguator]* followed by an *[undisambiguated-identifier]*.

The disambiguator is used to disambiguate identical identifiers that should not otherwise be considered the same.
For example, closures have no name, so the disambiguator is the only differentiating element between two different closures in the same parent path.

The undisambiguated-identifier starts with an optional `u` character,
which indicates that the identifier is encoded in [Punycode][Punycode identifiers].
The next part is a *[decimal-number]* which indicates the length of the *bytes*.

Following the identifier size is an optional `_` character which is used to separate the length value from the identifier itself.
The `_` is mandatory if the *bytes* starts with a decimal digit or `_` in order to keep it unambiguous where the *decimal-number* ends and the *bytes* starts.

*bytes* is the identifier itself encoded in UTF-8.
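The parsing steps for an *undisambiguated-identifier* can be sketched as below. `Ident` and `parse_undisambiguated_identifier` are hypothetical names, and the sketch makes two simplifying assumptions: it always consumes a single `_` after the length if one is present, and it does not reject lengths written with leading zeros.

```rust
#[derive(Debug, PartialEq)]
struct Ident<'a> {
    /// True when the `u` prefix was present (Punycode-encoded bytes).
    punycode: bool,
    /// The raw identifier bytes, not yet Punycode-decoded.
    bytes: &'a str,
}

/// Parses one undisambiguated-identifier from the front of `input`,
/// returning it together with the unparsed remainder.
fn parse_undisambiguated_identifier<'a>(input: &'a str) -> Option<(Ident<'a>, &'a str)> {
    let (punycode, rest) = match input.strip_prefix('u') {
        Some(rest) => (true, rest),
        None => (false, input),
    };
    // The decimal-number gives the length of the bytes in bytes.
    let digits = rest.bytes().take_while(u8::is_ascii_digit).count();
    let len: usize = rest[..digits].parse().ok()?;
    let mut rest = &rest[digits..];
    // Skip the separator that keeps the length apart from the bytes.
    if let Some(after) = rest.strip_prefix('_') {
        rest = after;
    }
    let bytes = rest.get(..len)?;
    Some((Ident { punycode, bytes }, &rest[len..]))
}
```

For example, an identifier named `_foo` would be written as `4__foo`: the first `_` is the separator and the second is part of the name.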

> **Recommended Demangling**
>
> The display of an *identifier* can depend on its context.
> If it is Punycode-encoded, then it may first be decoded before being displayed.
>
> The *[disambiguator]* may or may not be displayed; see recommendations for rules that use *identifier*.

### Punycode identifiers
[Punycode identifiers]: #punycode-identifiers

Because some environments are restricted to ASCII alphanumerics and `_`,
Rust's [Unicode identifiers][reference-identifiers] may be encoded using a modified version of [Punycode].

For example, the function:

```rust
mod gödel {
  mod escher {
    fn bach() {}
  }
}
```

would be mangled as:

```text
_RNvNtNtCsgOH4LzxkuMq_7mycrateu8gdel_5qa6escher4bach
                              ││└───┬──┘
                              ││    │
                              ││    └── gdel_5qa translates to gödel
                              │└─────── 8 is the length
                              └──────── `u` indicates it is a Unicode identifier
```

Standard Punycode generates strings of the form `([[:ascii:]]+-)?[[:alnum:]]+`.
This is problematic because the `-` character
(which is used to separate the ASCII part from the base-36 encoding)
is not in the supported character set for symbols.
For this reason, `-` characters in the Punycode encoding are replaced with `_`.

Here are some examples:

| Original        | Punycode        | Punycode + Encoding |
|-----------------|-----------------|---------------------|
| føø             | f-5gaa          | f_5gaa              |
| α_ω             | _-ylb7e         | __ylb7e             |
| 铁锈             | n84amf          | n84amf              |
| 🤦              | fq9h            | fq9h                |
| ρυστ            | 2xaedc          | 2xaedc              |
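To make the modified encoding concrete, here is a minimal decoding sketch that follows the general decoding procedure of RFC 3492, using the last `_` as the delimiter instead of `-`. The function name `decode_punycode_ident` is hypothetical, and the sketch assumes lowercase letters and digits only in the encoded part. It is not necessarily how any particular demangler implements decoding.

```rust
/// Decodes an identifier in the modified Punycode form (`_` as delimiter).
/// Returns `None` on malformed input or arithmetic overflow.
fn decode_punycode_ident(encoded: &str) -> Option<String> {
    // Parameters from RFC 3492.
    const BASE: u32 = 36;
    const T_MIN: u32 = 1;
    const T_MAX: u32 = 26;
    const SKEW: u32 = 38;
    const DAMP: u32 = 700;

    // The last `_` separates the literal ASCII part from the encoded deltas.
    let (ascii, deltas) = match encoded.rfind('_') {
        Some(pos) => (&encoded[..pos], &encoded[pos + 1..]),
        None => ("", encoded),
    };
    let mut output: Vec<char> = ascii.chars().collect();
    let mut digits = deltas.bytes().peekable();
    let (mut n, mut i, mut bias) = (128u32, 0u32, 72u32);

    while digits.peek().is_some() {
        // Decode one variable-length integer into `i`.
        let old_i = i;
        let (mut weight, mut k) = (1u32, BASE);
        loop {
            let digit = match digits.next()? {
                b @ b'a'..=b'z' => u32::from(b - b'a'),
                b @ b'0'..=b'9' => u32::from(b - b'0') + 26,
                _ => return None,
            };
            i = i.checked_add(digit.checked_mul(weight)?)?;
            let threshold = if k <= bias {
                T_MIN
            } else if k >= bias + T_MAX {
                T_MAX
            } else {
                k - bias
            };
            if digit < threshold {
                break;
            }
            weight = weight.checked_mul(BASE - threshold)?;
            k += BASE;
        }

        // Bias adaptation for the next delta.
        let len = output.len() as u32 + 1;
        let divisor = if old_i == 0 { DAMP } else { 2 };
        let mut delta = (i - old_i) / divisor;
        delta += delta / len;
        let mut shift = 0;
        while delta > ((BASE - T_MIN) * T_MAX) / 2 {
            delta /= BASE - T_MIN;
            shift += BASE;
        }
        bias = shift + (BASE * delta) / (delta + SKEW);

        // `i` encodes both the code point increment and the insert position.
        n = n.checked_add(i / len)?;
        i %= len;
        output.insert(i as usize, char::from_u32(n)?);
        i += 1;
    }
    Some(output.into_iter().collect())
}
```

Applied to the `gdel_5qa` identifier from the example above, this sketch yields `gödel`.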

> Note: It is up to the compiler to decide whether to encode identifiers using Punycode.
> Some platforms may have native support for UTF-8 symbols,
> and the compiler may decide to use the UTF-8 encoding directly.
> Demanglers should be prepared to support either form.

[Punycode]: https://tools.ietf.org/html/rfc3492

## Disambiguator
[disambiguator]: #disambiguator

> disambiguator → `s` *[base-62-number]*

A *disambiguator* is used in various parts of a symbol *[path]* to uniquely identify path elements that would otherwise be identical but should not be considered the same.
It starts with the character `s` and is followed by a *[base-62-number]*.

If the *disambiguator* is not specified, then its value can be assumed to be zero.
Otherwise, when demangling, the value 1 should be added to the *[base-62-number]*
(thus a *base-62-number* of zero encoded as `_` has a value of 1).
This allows disambiguators that are encoded sequentially to use minimal bytes.
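The off-by-one rules above can be sketched in code. The sketch assumes the *[base-62-number]* encoding defined elsewhere in this document: `_` alone is zero, and otherwise the digits `0-9`, `a-z`, `A-Z` (values 0 to 61) encode the value minus one, terminated by `_`. The function names are hypothetical.

```rust
/// Decodes a base-62-number from the front of `input`.
/// Assumption: `_` alone is 0; otherwise the digits encode `value - 1`.
fn base62_number(input: &str) -> Option<(u64, &str)> {
    let end = input.find('_')?;
    let rest = &input[end + 1..];
    if end == 0 {
        return Some((0, rest));
    }
    let mut value: u64 = 0;
    for b in input[..end].bytes() {
        let digit = match b {
            b'0'..=b'9' => b - b'0',
            b'a'..=b'z' => b - b'a' + 10,
            b'A'..=b'Z' => b - b'A' + 36,
            _ => return None,
        };
        value = value.checked_mul(62)?.checked_add(u64::from(digit))?;
    }
    Some((value.checked_add(1)?, rest))
}

/// Decodes an optional disambiguator: 0 when absent, base-62-number + 1 otherwise.
fn disambiguator(input: &str) -> Option<(u64, &str)> {
    match input.strip_prefix('s') {
        None => Some((0, input)),
        Some(rest) => {
            let (number, rest) = base62_number(rest)?;
            Some((number.checked_add(1)?, rest))
        }
    }
}
```

For the crate root in the earlier `mycrate` example, `s15kBYyAo9fc_` decodes to `0xca63f166dbe9294`, matching the alternate display `mycrate[ca63f166dbe9294]`.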

> **Recommended Demangling**
>
> The *disambiguator* may or may not be displayed; see recommendations for rules that use *disambiguator*.

## Lifetime
[lifetime]: #lifetime

> lifetime → `L` *[base-62-number]*

A *lifetime* is used to encode an anonymous (numbered) lifetime, either erased or [higher-ranked](#binder).
It starts with the character `L` and is followed by a *[base-62-number]*.
Index 0 is always erased.
Indices starting from 1 refer (as de Bruijn indices) to a higher-ranked lifetime bound by one of the enclosing <em>[binder]</em>s.

> **Recommended Demangling**
>
> A *lifetime* may be displayed like a Rust lifetime using a single quote.
>
> Index 0 should be displayed as `'_`.
> Index 0 should not be displayed for lifetimes in a *[ref-type]*, *[mut-ref-type]*, or *[dyn-trait-type]*.
>
> A lifetime can be displayed by converting the de Bruijn index to a de Bruijn level
> (level = number of bound lifetimes - index) and selecting a unique name for each level,
> for example single lowercase letters starting with `'a` for level 0.
> For levels over 25, consider printing the numeric lifetime, as in `'_123`.
> See *[binder]* for more on lifetime indexes and ordering.

> Example:
> ```rust
> fn main() {
>     example::<fn(&u8, &u16)>();
> }
>
> pub fn example<T>() {}
> ```
>
> The symbol for the function `example` is:
>
> ```text
> _RINvCs7qp2U7fqm6G_7mycrate7exampleFG0_RL1_hRL0_tEuEB2_
>                                    │└┬┘│└┬┘││└┬┘││
>                                    │ │ │ │ ││ │ │└── end of input types
>                                    │ │ │ │ ││ │ └─── type u16
>                                    │ │ │ │ ││ └───── lifetime #1 'b
>                                    │ │ │ │ │└─────── reference type
>                                    │ │ │ │ └──────── type u8
>                                    │ │ │ └────────── lifetime #2 'a
>                                    │ │ └──────────── reference type
>                                    │ └────────────── binder with 2 lifetimes
>                                    └──────────────── function type
> ```
>
> Recommended demangling: `mycrate::example::<for<'a, 'b> fn(&'a u8, &'b u16)>`
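The index-to-level conversion recommended above can be sketched as a small function. `lifetime_name` is a hypothetical helper. It takes the decoded lifetime index and the total number of lifetimes bound by the enclosing binders, and returns `None` when the index refers to a lifetime that no binder has introduced.

```rust
/// Chooses a display name for a decoded lifetime index.
/// `bound_lifetimes` is the number of lifetimes bound by all enclosing binders.
fn lifetime_name(index: u64, bound_lifetimes: u64) -> Option<String> {
    if index == 0 {
        // Index 0 is the erased lifetime.
        return Some("'_".to_string());
    }
    // level = number of bound lifetimes - index
    let level = bound_lifetimes.checked_sub(index)?;
    Some(if level < 26 {
        format!("'{}", (b'a' + level as u8) as char)
    } else {
        format!("'_{}", level)
    })
}
```

In the example above, two lifetimes are bound, so index 2 (`L1_`) becomes `'a` and index 1 (`L0_`) becomes `'b`.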

## Const
[const]: #const
[const-data]: #const
[hex-digit]: #const

> const → \
> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *[type]* *[const-data]* \
> &nbsp;&nbsp; | `p` \
> &nbsp;&nbsp; | *[backref]*
>
> const-data → `n`<sub>opt</sub> {*[hex-digit]*} `_`
>
> [hex-digit] → *[digit]* | `a` | `b` | `c` | `d` | `e` | `f`

A *const* is used to encode a const value used in generics and types.
It has the following forms:

* A constant value encoded as a *[type]* which represents the type of the constant and *[const-data]* which is the constant value, followed by `_` to terminate the *const*.
* The character `p` which represents a [placeholder].
* A *[backref]* to a previously encoded *const* of the same value.

The encoding of the *const-data* depends on the type:

* `bool` — The value `false` is encoded as `0_`, the value `true` is encoded as `1_`.
* `char` — The Unicode scalar value of the character is encoded in hexadecimal.
* Unsigned integers — The value is encoded in hexadecimal.
* Signed integers — The character `n` is a prefix to indicate that it is negative,
  followed by the absolute value encoded in hexadecimal.
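As an illustration of the integer rules above, the following sketch decodes a *const-data* into a signed value. `decode_const_data` is a hypothetical name. The sketch only handles magnitudes that fit in 64 bits, and it treats an empty digit sequence as zero, which is an assumption rather than something stated here.

```rust
/// Decodes integer const-data (`n`? hex-digits `_`) from the front of `input`.
/// Returns the value and the unparsed remainder.
fn decode_const_data(input: &str) -> Option<(i128, &str)> {
    let (negative, rest) = match input.strip_prefix('n') {
        Some(rest) => (true, rest),
        None => (false, input),
    };
    let end = rest.find('_')?;
    let hex = &rest[..end];
    // Only lowercase hex digits are part of the grammar.
    if !hex.bytes().all(|b| matches!(b, b'0'..=b'9' | b'a'..=b'f')) {
        return None;
    }
    // Assumption: no digits at all is read as zero.
    let magnitude = if hex.is_empty() {
        0
    } else {
        u64::from_str_radix(hex, 16).ok()?
    };
    let value = i128::from(magnitude);
    Some((if negative { -value } else { value }, &rest[end + 1..]))
}
```

A `bool` or `char` const could reuse the same decoding and then map the number to `true`/`false` or to a character with `char::from_u32`.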

> **Recommended Demangling**
>
> A *const* may be displayed by the const value depending on the type.
>
> The `p` placeholder should be displayed as the `_` character.
>
> For specific types:
> * `b` (bool) — Display as `true` or `false`.
> * `c` (char) — Display the character as a Rust character literal (such as `'A'` or `'\n'`).
> * integers — Display the integer (either in decimal or hex).

> Example:
> ```rust
> fn main() {
>     example::<0x12345678>();
> }
>
> pub fn example<const N: u64>() {}
> ```
>
> The symbol for function `example` is:
>
> ```text
> _RINvCs7qp2U7fqm6G_7mycrate7exampleKy12345678_EB2_
>                                    ││└───┬───┘
>                                    ││    │
>                                    ││    └── const-data 0x12345678
>                                    │└─────── const type u64
>                                    └──────── const generic arg
> ```
>
> Recommended demangling: `mycrate::example::<305419896>`

### Placeholders
[placeholder]: #placeholders

A *placeholder* may occur in circumstances where a type or const value is not relevant.

> Example:
> ```rust
> pub struct Example<T, const N: usize>([T; N]);
>
> impl<T, const N: usize> Example<T, N> {
>     pub fn foo() -> &'static () {
>         static EXAMPLE_STATIC: () = ();
>         &EXAMPLE_STATIC
>     }
> }
> ```
>
> In this example, the static `EXAMPLE_STATIC` would not be monomorphized by the type or const parameters `T` and `N`.
> Those will use the placeholder for those generic arguments.
> Its symbol is:
>
> ```text
> _RNvNvMCsd9PVOYlP1UU_7mycrateINtB4_7ExamplepKpE3foo14EXAMPLE_STATIC
>                              │             │││
>                              │             ││└── const placeholder
>                              │             │└─── const generic argument
>                              │             └──── type placeholder
>                              └────────────────── generic-args
> ```
>
> Recommended demangling: `<mycrate::Example<_, _>>::foo::EXAMPLE_STATIC`


## Type
[type]: #type
[basic-type]: #basic-type
[array-type]: #array-type
[slice-type]: #slice-type
[tuple-type]: #tuple-type
[ref-type]: #ref-type
[mut-ref-type]: #mut-ref-type
[const-ptr-type]: #const-ptr-type
[mut-ptr-type]: #mut-ptr-type
[fn-type]: #fn-type
[dyn-trait-type]: #dyn-trait-type

> type → \
> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *[basic-type]* \
> &nbsp;&nbsp; | *[array-type]* \
> &nbsp;&nbsp; | *[slice-type]* \
> &nbsp;&nbsp; | *[tuple-type]* \
> &nbsp;&nbsp; | *[ref-type]* \
> &nbsp;&nbsp; | *[mut-ref-type]* \
> &nbsp;&nbsp; | *[const-ptr-type]* \
> &nbsp;&nbsp; | *[mut-ptr-type]* \
> &nbsp;&nbsp; | *[fn-type]* \
> &nbsp;&nbsp; | *[dyn-trait-type]* \
> &nbsp;&nbsp; | *[path]* \
> &nbsp;&nbsp; | *[backref]*

A *type* represents a Rust [type][reference-types].
The initial character can be used to distinguish which type is encoded.
The type encodings based on the initial tag character are:

* A <span id="basic-type">*basic-type*</span> is encoded as a single character:
  * `a` — `i8`
  * `b` — `bool`
  * `c` — `char`
  * `d` — `f64`
  * `e` — `str`
  * `f` — `f32`
  * `h` — `u8`
  * `i` — `isize`
  * `j` — `usize`
  * `l` — `i32`
  * `m` — `u32`
  * `n` — `i128`
  * `o` — `u128`
  * `s` — `i16`
  * `t` — `u16`
  * `u` — unit `()`
  * `v` — variadic `...`
  * `x` — `i64`
  * `y` — `u64`
  * `z` — `!`
  * `p` — [placeholder] `_`

* `A` — An [array][reference-array] `[T; N]`.

  > <span id="array-type">array-type</span> → `A` *[type]* *[const]*

  The tag `A` is followed by the *[type]* of the array followed by a *[const]* for the array size.

* `S` — A [slice][reference-slice] `[T]`.

  > <span id="slice-type">slice-type</span> → `S` *[type]*

  The tag `S` is followed by the *[type]* of the slice.

* `T` — A [tuple][reference-tuple] `(T1, T2, T3, ...)`.

  > <span id="tuple-type">tuple-type</span> → `T` {*[type]*} `E`

  The tag `T` is followed by one or more <em>[type]</em>s indicating the type of each field, followed by a terminating `E` character.

  Note that a zero-length tuple (unit) is encoded with the `u` *[basic-type]*.

* `R` — A [reference][reference-shared-reference] `&T`.

  > <span id="ref-type">ref-type</span> →  `R` *[lifetime]*<sub>opt</sub> *[type]*

  The tag `R` is followed by an optional *[lifetime]* followed by the *[type]* of the reference.
  The lifetime is not included if it has been erased.

* `Q` — A [mutable reference][reference-mutable-reference] `&mut T`.

  > <span id="mut-ref-type">mut-ref-type</span> → `Q` *[lifetime]*<sub>opt</sub> *[type]*

  The tag `Q` is followed by an optional *[lifetime]* followed by the *[type]* of the mutable reference.
  The lifetime is not included if it has been erased.

* `P` — A [constant raw pointer][reference-raw-pointer] `*const T`.

  > <span id="const-ptr-type">const-ptr-type</span> → `P` *[type]*

  The tag `P` is followed by the *[type]* of the pointer.

* `O` — A [mutable raw pointer][reference-raw-pointer] `*mut T`.

  > <span id="mut-ptr-type">mut-ptr-type</span> → `O` *[type]*

  The tag `O` is followed by the *[type]* of the pointer.

* `F` — A [function pointer][reference-fn-pointer] `fn(…) -> …`.

  > <span id="fn-type">fn-type</span> → `F` *[fn-sig]*
  >
  > <span id="fn-sig">fn-sig</span> → *[binder]*<sub>opt</sub> `U`<sub>opt</sub> (`K` *[abi]*)<sub>opt</sub> {*[type]*} `E` *[type]*
  >
  > <span id="abi">abi</span> → \
  > &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; `C` \
  > &nbsp;&nbsp; | *[undisambiguated-identifier]*

  The tag `F` is followed by a *[fn-sig]* of the function signature.
  A *fn-sig* is the signature for a function pointer.

  It starts with an optional *[binder]* which represents the higher-ranked trait bounds (`for<…>`).

  Following that is an optional `U` character which is present for an `unsafe` function.

  Following that is an optional `K` character which indicates that an *[abi]* is specified.
  If the ABI is not specified, it is assumed to be the `"Rust"` ABI.

  The *[abi]* can be the letter `C` to indicate it is the `"C"` ABI.
  Otherwise it is an *[undisambiguated-identifier]* of the ABI string with dashes converted to underscores.

  Following that is zero or more <em>[type]</em>s which indicate the input parameters of the function.

  Following that is the character `E` and then the *[type]* of the return value.

[fn-sig]: #fn-sig
[abi]: #abi

* `D` — A [trait object][reference-trait-object] `dyn Trait<Assoc=X> + Send + 'a`.

  > <span id="dyn-trait-type">dyn-trait-type</span> → `D` *[dyn-bounds]* *[lifetime]*
  >
  > <span id="dyn-bounds">dyn-bounds</span> → *[binder]*<sub>opt</sub> {*[dyn-trait]*} `E`
  >
  > <span id="dyn-trait">dyn-trait</span> → *[path]* {*[dyn-trait-assoc-binding]*}
  >
  > <span id="dyn-trait-assoc-binding">dyn-trait-assoc-binding</span> → `p` *[undisambiguated-identifier]* *[type]*

  The tag `D` is followed by a *[dyn-bounds]* which encodes the trait bounds,
  followed by a *[lifetime]* of the trait object lifetime bound.

  A *dyn-bounds* starts with an optional *[binder]* which represents the higher-ranked trait bounds (`for<…>`).
  Following that is a sequence of *[dyn-trait]* terminated by the character `E`.

  Each *[dyn-trait]* represents a trait bound, which consists of a *[path]* to the trait followed by zero or more *[dyn-trait-assoc-binding]* which list the associated types.

  Each *[dyn-trait-assoc-binding]* consists of a character `p` followed by an *[undisambiguated-identifier]* representing the associated binding name, and finally a *[type]*.

[dyn-bounds]: #dyn-bounds
[dyn-trait]: #dyn-trait
[dyn-trait-assoc-binding]: #dyn-trait-assoc-binding


* A *[path]* to a named type.

* A *[backref]* to refer to a previously encoded type.
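
The *[abi]* rule of the `F` tag described above can be sketched from the encoder's side.
This is a minimal illustration, not the compiler's implementation: the function name `encode_abi` is hypothetical, ABI strings are assumed to be ASCII, and the identifier layout follows the *[undisambiguated-identifier]* rule (a decimal length, an `_` separator when the bytes start with a digit or `_`, then the bytes).

```rust
/// Encodes the optional ABI portion of a fn-sig (hypothetical helper).
/// Returns an empty string for the default `"Rust"` ABI.
pub fn encode_abi(abi: &str) -> String {
    match abi {
        // No `K` tag at all: the "Rust" ABI is assumed when unspecified.
        "Rust" => String::new(),
        // The "C" ABI has the dedicated single-letter form.
        "C" => "KC".to_string(),
        other => {
            // Dashes are converted to underscores before encoding.
            let name = other.replace('-', "_");
            // Assumption: ASCII only, so the `u` (Punycode) form is not needed.
            let starts_ambiguously =
                name.starts_with(|c: char| c == '_' || c.is_ascii_digit());
            let separator = if starts_ambiguously { "_" } else { "" };
            format!("K{}{}{}", name.len(), separator, name)
        }
    }
}
```

Under these assumptions, `encode_abi("C-unwind")` yields `K8C_unwind`.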

> **Recommended Demangling**
>
> A *[type]* may be displayed as the type it represents, using typical Rust syntax to represent the type.

> Example:
> ```rust
> fn main() {
>     example::<[u16; 8]>();
> }
>
> pub fn example<T>() {}
> ```
>
> The symbol for function `example` is:
>
> ```text
> _RINvCs7qp2U7fqm6G_7mycrate7exampleAtj8_EB2_
>                                    │││├┘│
>                                    ││││ └─── end of generic args
>                                    │││└───── const data 8
>                                    ││└────── const type usize
>                                    │└─────── array element type u16
>                                    └──────── array type
> ```
>
> Recommended demangling: `mycrate::example::<[u16; 8]>`

## Binder
[binder]: #binder

> binder → `G` *[base-62-number]*

A *binder* represents the number of [higher-ranked trait bound][reference-hrtb] lifetimes to bind.
It consists of the character `G` followed by a *[base-62-number]*.
The value 1 should be added to the *[base-62-number]* when decoding
(such that the *base-62-number* encoding of `_` is interpreted as having 1 binder).

A *[lifetime]* rule can then refer to these numbered lifetimes.
The lowest indices represent the innermost lifetimes.

For example, in `for<'a, 'b> fn(for<'c> fn (...))`, any <em>[lifetime]</em>s in `...`
(but not inside more binders) will observe the indices 1, 2, and 3 to refer to `'c`, `'b`, and `'a`, respectively.
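
The index ordering in this example can be sketched as a small lookup.
The function `lifetime_name` and the `'a`, `'b`, … naming are hypothetical choices made for illustration; the sketch assumes the binder counts have already been decoded (the *base-62-number* plus one) and does not handle index 0, which is not bound by any binder.

```rust
/// Resolves a bound lifetime index to a display name such as `'a`.
///
/// `binder_counts` holds the number of lifetimes bound by each enclosing
/// binder, outermost first.
pub fn lifetime_name(binder_counts: &[u64], index: u64) -> Option<String> {
    let total: u64 = binder_counts.iter().sum();
    // Assumption: index 0 is not a bound lifetime, and indices beyond the
    // number of bound lifetimes are invalid.
    if index == 0 || index > total {
        return None;
    }
    // The lowest index is the innermost lifetime, so count back from the end.
    // Position 0 is the first lifetime of the outermost binder.
    let position = total - index;
    // Hypothetical naming scheme: 'a through 'z, then a numbered fallback.
    let name = if position < 26 {
        format!("'{}", (b'a' + position as u8) as char)
    } else {
        format!("'_{}", position)
    };
    Some(name)
}
```

For `for<'a, 'b> fn(for<'c> fn (...))` the binder counts are `[2, 1]`, so indices 1, 2, and 3 resolve to `'c`, `'b`, and `'a`.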

> **Recommended Demangling**
>
> A *binder* may be printed using `for<…>` syntax listing the lifetimes as recommended in *[lifetime]*.
> See *[lifetime]* for an example.

## Backref
[backref]: #backref

> backref → `B` *[base-62-number]*

A *backref* is used to refer to a previous part of the mangled symbol.
This provides a simple form of compression to reduce the length of the mangled symbol.
This can help reduce the amount of work and resources needed by the compiler, linker, and loader.

It consists of the character `B` followed by a *[base-62-number]*.
The number indicates the 0-based offset in bytes starting from just after the `_R` prefix of the symbol.
The *backref* represents the corresponding element starting at that position.

<em>backref</em>s always refer to a position before the *backref* itself.

The *backref* compression relies on the fact that all substitutable symbol elements have a self-terminating mangled form.
Given the start position of the encoded node, the grammar guarantees that it is always unambiguous where the node ends.
This is ensured by not allowing optional or repeating elements at the end of substitutable productions.
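
As a rough sketch of the offset rule, the following function locates the element a *backref* points to.
The name `backref_target` is hypothetical, and the function only computes the target position; parsing the element found there is left to the rest of a demangler.

```rust
/// Given the full symbol and the byte index `pos` of a `B` tag, returns the
/// byte index (into `symbol`) of the element the backref refers to.
pub fn backref_target(symbol: &str, pos: usize) -> Option<usize> {
    let bytes = symbol.as_bytes();
    if !symbol.starts_with("_R") || bytes.get(pos) != Some(&b'B') {
        return None;
    }
    let mut acc: usize = 0;
    let mut digits = 0;
    for &b in &bytes[pos + 1..] {
        let d = match b {
            b'_' => {
                // An empty digit sequence means 0; otherwise add one.
                let offset = if digits == 0 { 0 } else { acc.checked_add(1)? };
                // Offsets are counted from just after the `_R` prefix.
                let target = offset.checked_add(2)?;
                // A backref must point to a position before itself.
                return if target < pos { Some(target) } else { None };
            }
            b'0'..=b'9' => b - b'0',
            b'a'..=b'z' => b - b'a' + 10,
            b'A'..=b'Z' => b - b'A' + 36,
            _ => return None,
        };
        acc = acc.checked_mul(62)?.checked_add(d as usize)?;
        digits += 1;
    }
    // The terminating `_` was never found.
    None
}
```

Rejecting targets at or after the backref itself also guards against trivially self-referential input.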

> **Recommended Demangling**
>
> A *backref* should be demangled by rendering the element that it points to.
> Care should be taken when handling deeply nested backrefs to avoid using too much stack.

> Example:
> ```rust
> fn main() {
>     example::<Example, Example>();
> }
>
> struct Example;
>
> pub fn example<T, U>() {}
> ```
>
> The symbol for function `example` is:
>
> ```text
> _RINvCs7qp2U7fqm6G_7mycrate7exampleNtB2_7ExampleBw_EB2_
>                                      │├┘        │├┘ │├┘
>                                      ││         ││  ││
>                                      ││         ││  │└── backref to offset 3 (crate-root)
>                                      ││         ││  └─── backref for instantiating-crate path
>                                      ││         │└────── backref to offset 33 (path to Example)
>                                      ││         └─────── backref for second generic-arg
>                                      │└───────────────── backref to offset 3 (crate-root)
>                                      └────────────────── backref for first generic-arg (first segment of Example path)
> ```
>
> Recommended demangling: `mycrate::example::<mycrate::Example, mycrate::Example>`

## Instantiating crate
[instantiating-crate]: #instantiating-crate

> instantiating-crate → *[path]*

The *instantiating-crate* is an optional element of the *[symbol-name]* which can be used to indicate which crate is instantiating the symbol.
It consists of a single *[path]*.

This helps differentiate symbols that would otherwise be identical.
For example, the monomorphization of a function from an external crate may produce a duplicate symbol if another crate instantiates the same generic function with the same types.

In practice, the instantiating crate is also often the crate where the symbol is defined,
so it is usually encoded as a *[backref]* to the *[crate-root]* encoded elsewhere in the symbol.

> **Recommended Demangling**
>
> The *instantiating-crate* usually need not be displayed.

> Example:
> ```rust
> std::path::Path::new("example");
> ```
>
> The symbol for `Path::new::<str>` instantiated from the `mycrate` crate is:
>
> ```text
> _RINvMsY_NtCseXNvpPnDBDp_3std4pathNtB6_4Path3neweECs7qp2U7fqm6G_7mycrate
>                                                                 └──┬───┘
>                                                                    │
>                                                                    └── instantiating crate identifier `mycrate`
> ```
>
> Recommended demangling: `<std::path::Path>::new::<str>`

## Vendor-specific suffix
[vendor-specific-suffix]: #vendor-specific-suffix
[suffix]: #vendor-specific-suffix

> vendor-specific-suffix → (`.` | `$`) *[suffix]*
>
> suffix → {*byte*}

The *vendor-specific-suffix* is an optional element at the end of the *[symbol-name]*.
It consists of either a `.` or `$` character followed by zero or more bytes.
There are no restrictions on the characters following the period or dollar sign.

This suffix is added as needed by the implementation.
One example where this can happen is when locally unique names need to become globally unique.
LLVM can append a `.llvm.<numbers>` suffix during LTO to ensure a unique name,
and `$` can be used for thread-local data on Mach-O.
In these situations it's generally fine to ignore the suffix;
the suffixed name has the same semantics as the original.
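
A tool that wants to ignore the suffix could separate it before demangling.
The sketch below is one simple approach, not a prescribed one: the name `split_vendor_suffix` is hypothetical, and it assumes the symbol proper uses only the restricted character set (`A-Z`, `a-z`, `0-9`, and `_`), so the first `.` or `$` must begin the suffix.

```rust
/// Splits a symbol into the part before the vendor-specific suffix and the
/// suffix itself (including its leading `.` or `$`), if one is present.
pub fn split_vendor_suffix(symbol: &str) -> (&str, Option<&str>) {
    // Assumption: `.` and `$` never occur inside the symbol proper.
    match symbol.find(|c: char| c == '.' || c == '$') {
        Some(i) => (&symbol[..i], Some(&symbol[i..])),
        None => (symbol, None),
    }
}
```

A demangler that parses the whole grammar can instead treat whatever remains after the *[instantiating-crate]* as the suffix, which avoids relying on the character-set assumption.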

> **Recommended Demangling**
>
> The *vendor-specific-suffix* usually need not be displayed.

> Example:
> ```rust
> # use std::cell::RefCell;
> thread_local! {
>     pub static EXAMPLE: RefCell<u32> = RefCell::new(1);
> }
> ```
>
> On macOS, the symbol for `EXAMPLE` may take the following form for thread-local data:
>
> ```text
> _RNvNvNvCs7qp2U7fqm6G_7mycrate7EXAMPLE7___getit5___KEY$tlv$init
>                                                       └───┬───┘
>                                                           │
>                                                           └── vendor-specific-suffix
> ```
>
> Recommended demangling: `mycrate::EXAMPLE::__getit::__KEY`

## Common rules
[decimal-number]: #common-rules
[digit]: #common-rules
[non-zero-digit]: #common-rules
[lower]: #common-rules
[upper]: #common-rules

> [decimal-number] → \
> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; `0` \
> &nbsp;&nbsp; | *[non-zero-digit]* {*[digit]*}
>
> [non-zero-digit] → `1` | `2` | `3` | `4` | `5` | `6` | `7` | `8` | `9` \
> [digit] → `0` | *[non-zero-digit]*
>
> [lower] → `a` |`b` |`c` |`d` |`e` |`f` |`g` |`h` |`i` |`j` |`k` |`l` |`m` |`n` |`o` |`p` |`q` |`r` |`s` |`t` |`u` |`v` |`w` |`x` |`y` |`z`
>
> [upper] → `A` | `B` | `C` | `D` | `E` | `F` | `G` | `H` | `I` | `J` | `K` | `L` | `M` | `N` | `O` | `P` | `Q` | `R` | `S` | `T` | `U` | `V` | `W` | `X` | `Y` | `Z`

A *decimal-number* is encoded as one or more <em>[digit]</em>s indicating a numeric value in decimal.

The value zero is encoded as a single byte `0`.
Beware that there are situations where `0` may be followed by another digit that should not be decoded as part of the decimal-number.
For example, a zero-length *[identifier]* within a *[nested-path]* which is in turn inside another *[nested-path]* will result in two identifiers in a row, where the first one only has the encoding of `0`.

A *digit* is an ASCII decimal digit (`0`-`9`).

A *lower* and an *upper* are an ASCII lowercase and uppercase letter, respectively.
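
The special handling of a leading `0` can be sketched as follows.
The function name `parse_decimal` is hypothetical; it returns the decoded value together with the unconsumed remainder of the input.

```rust
/// Parses a decimal-number from the start of `input`.
/// Returns the value and the rest of the input, or `None` if `input` does
/// not start with a digit or the value does not fit in a `u64`.
pub fn parse_decimal(input: &str) -> Option<(u64, &str)> {
    let bytes = input.as_bytes();
    match *bytes.first()? {
        // A leading `0` is the complete number; following digits belong to
        // whatever comes next in the symbol.
        b'0' => Some((0, &input[1..])),
        b'1'..=b'9' => {
            let len = bytes.iter().take_while(|b| b.is_ascii_digit()).count();
            let value = input[..len].parse().ok()?;
            Some((value, &input[len..]))
        }
        _ => None,
    }
}
```

With the input `03foo`, the first call yields the value 0 and leaves `3foo`, which is the situation described above where a zero-length identifier is followed by another identifier.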

## base-62-number
[base-62-number]: #base-62-number

> [base-62-number] → { *[digit]* | *[lower]* | *[upper]* } `_`

A *base-62-number* is an encoding of a numeric value.
It uses ASCII digits and lowercase and uppercase letters.
The value is terminated with the `_` character.
If the value is 0, then the encoding is the `_` character without any digits.
Otherwise, one is subtracted from the value, and it is encoded with the mapping:

* `0`-`9` maps to 0-9
* `a`-`z` maps to 10-35
* `A`-`Z` maps to 36-61

The number is repeatedly divided by 62 (integer division, rounding towards zero).
The remainder of each division is mapped to the next character in the sequence.
This is repeated until the number is 0; at least one division is always performed, so the value 1 is encoded as `0_`.
The final sequence of characters is then reversed.

Decoding is a similar process in reverse.

Examples:

| Value | Encoding |
|-------|----------|
| 0     | `_`      |
| 1     | `0_`     |
| 11    | `a_`     |
| 62    | `Z_`     |
| 63    | `10_`    |
| 1000  | `g7_`    |
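
The encoding and decoding steps can be sketched as a pair of functions.
The names `encode_base62` and `decode_base62` are hypothetical, and the sketch limits values to what fits in a `u64`.

```rust
/// Encodes `value` as a base-62-number, including the terminating `_`.
pub fn encode_base62(value: u64) -> String {
    const DIGITS: &[u8; 62] =
        b"0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
    let mut out = String::new();
    // Zero has no digits; every other value is encoded as `value - 1`.
    if let Some(mut n) = value.checked_sub(1) {
        let mut reversed = Vec::new();
        loop {
            reversed.push(DIGITS[(n % 62) as usize]);
            n /= 62;
            if n == 0 {
                break;
            }
        }
        out.extend(reversed.iter().rev().map(|&b| b as char));
    }
    out.push('_');
    out
}

/// Decodes a base-62-number from the start of `input`.
/// Returns the value and the number of bytes consumed (including the `_`).
pub fn decode_base62(input: &str) -> Option<(u64, usize)> {
    let mut acc: u64 = 0;
    for (i, b) in input.bytes().enumerate() {
        let d = match b {
            b'_' if i == 0 => return Some((0, 1)),
            b'_' => return acc.checked_add(1).map(|v| (v, i + 1)),
            b'0'..=b'9' => b - b'0',
            b'a'..=b'z' => b - b'a' + 10,
            b'A'..=b'Z' => b - b'A' + 36,
            _ => return None,
        };
        acc = acc.checked_mul(62)?.checked_add(u64::from(d))?;
    }
    // The terminating `_` was never found.
    None
}
```

Returning the consumed length lets a caller continue parsing the symbol immediately after the terminating `_`.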

## Symbol grammar summary
[summary]: #symbol-grammar-summary

The following is a summary of all of the productions of the symbol grammar.

> [symbol-name] → `_R` *[decimal-number]*<sub>opt</sub> *[path]* *[instantiating-crate]*<sub>opt</sub> *[vendor-specific-suffix]*<sub>opt</sub>
>
> [path] → \
> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *[crate-root]* \
> &nbsp;&nbsp; | *[inherent-impl]* \
> &nbsp;&nbsp; | *[trait-impl]* \
> &nbsp;&nbsp; | *[trait-definition]* \
> &nbsp;&nbsp; | *[nested-path]* \
> &nbsp;&nbsp; | *[generic-args]* \
> &nbsp;&nbsp; | *[backref]*
>
> [crate-root] → `C` *[identifier]* \
> [inherent-impl] → `M` *[impl-path]* *[type]* \
> [trait-impl] → `X` *[impl-path]* *[type]* *[path]* \
> [trait-definition] → `Y` *[type]* *[path]* \
> [nested-path] → `N` *[namespace]* *[path]* *[identifier]* \
> [generic-args] → `I` *[path]* {*[generic-arg]*} `E`
>
> [identifier] → *[disambiguator]*<sub>opt</sub> *[undisambiguated-identifier]* \
> [undisambiguated-identifier] → `u`<sub>opt</sub> *[decimal-number]* `_`<sub>opt</sub> *[bytes]* \
> [bytes] → {*UTF-8 bytes*}
>
> [disambiguator] → `s` *[base-62-number]*
>
> [impl-path] → *[disambiguator]*<sub>opt</sub> *[path]*
>
> [type] → \
> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *[basic-type]* \
> &nbsp;&nbsp; | *[array-type]* \
> &nbsp;&nbsp; | *[slice-type]* \
> &nbsp;&nbsp; | *[tuple-type]* \
> &nbsp;&nbsp; | *[ref-type]* \
> &nbsp;&nbsp; | *[mut-ref-type]* \
> &nbsp;&nbsp; | *[const-ptr-type]* \
> &nbsp;&nbsp; | *[mut-ptr-type]* \
> &nbsp;&nbsp; | *[fn-type]* \
> &nbsp;&nbsp; | *[dyn-trait-type]* \
> &nbsp;&nbsp; | *[path]* \
> &nbsp;&nbsp; | *[backref]*
>
> [basic-type] → *[lower]* \
> [array-type] → `A` *[type]* *[const]* \
> [slice-type] → `S` *[type]* \
> [tuple-type] → `T` {*[type]*} `E` \
> [ref-type] →  `R` *[lifetime]*<sub>opt</sub> *[type]* \
> [mut-ref-type] → `Q` *[lifetime]*<sub>opt</sub> *[type]* \
> [const-ptr-type] → `P` *[type]* \
> [mut-ptr-type] → `O` *[type]* \
> [fn-type] → `F` *[fn-sig]* \
> [dyn-trait-type] → `D` *[dyn-bounds]* *[lifetime]*
>
> [namespace] → *[lower]* | *[upper]*
>
> [generic-arg] → \
> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *[lifetime]* \
> &nbsp;&nbsp; | *[type]* \
> &nbsp;&nbsp; | `K` *[const]*
>
> [lifetime] → `L` *[base-62-number]*
>
> [const] → \
> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; *[type]* *[const-data]* \
> &nbsp;&nbsp; | `p` \
> &nbsp;&nbsp; | *[backref]*
>
> [const-data] → `n`<sub>opt</sub> {*[hex-digit]*} `_`
>
> [hex-digit] → *[digit]* | `a` | `b` | `c` | `d` | `e` | `f`
>
> [fn-sig] → *[binder]*<sub>opt</sub> `U`<sub>opt</sub> (`K` *[abi]*)<sub>opt</sub> {*[type]*} `E` *[type]*
>
> [abi] → \
> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; `C` \
> &nbsp;&nbsp; | *[undisambiguated-identifier]*
>
> [dyn-bounds] → *[binder]*<sub>opt</sub> {*[dyn-trait]*} `E` \
> [dyn-trait] → *[path]* {*[dyn-trait-assoc-binding]*} \
> [dyn-trait-assoc-binding] → `p` *[undisambiguated-identifier]* *[type]*
>
> [binder] → `G` *[base-62-number]*
>
> [backref] → `B` *[base-62-number]*
>
> [instantiating-crate] → *[path]*
>
> [vendor-specific-suffix] → (`.` | `$`) *[suffix]* \
> [suffix] → {*byte*}
>
> [decimal-number] → \
> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp; `0` \
> &nbsp;&nbsp; | *[non-zero-digit]* {*[digit]*}
>
> [base-62-number] → { *[digit]* | *[lower]* | *[upper]* } `_`
>
> [non-zero-digit] → `1` | `2` | `3` | `4` | `5` | `6` | `7` | `8` | `9` \
> [digit] → `0` | *[non-zero-digit]* \
> [lower] → `a` |`b` |`c` |`d` |`e` |`f` |`g` |`h` |`i` |`j` |`k` |`l` |`m` |`n` |`o` |`p` |`q` |`r` |`s` |`t` |`u` |`v` |`w` |`x` |`y` |`z` \
> [upper] → `A` | `B` | `C` | `D` | `E` | `F` | `G` | `H` | `I` | `J` | `K` | `L` | `M` | `N` | `O` | `P` | `Q` | `R` | `S` | `T` | `U` | `V` | `W` | `X` | `Y` | `Z`

## Encoding of Rust entities

The following are guidelines for how Rust entities are encoded in a symbol.
The compiler has some latitude in how an entity is encoded as long as the symbol is unambiguous.

* Named functions, methods, and statics shall be represented by a *[path]* production.

* Paths should be rooted at the inner-most entity that can act as a path root.
  Roots can be crate-ids, inherent impls, trait impls, and (for items within default methods) trait definitions.

* The compiler is free to choose disambiguation indices and namespace tags from
  the reserved ranges as long as it ensures that identifiers remain unambiguous.

* Generic arguments that are equal to the default should not be encoded in order to save space.


[RFC 2603]: https://rust-lang.github.io/rfcs/2603-rust-symbol-name-mangling-v0.html
[reference-array]: ../../reference/types/array.html
[reference-fn-pointer]: ../../reference/types/function-pointer.html
[reference-hrtb]: ../../reference/trait-bounds.html#higher-ranked-trait-bounds
[reference-identifiers]: ../../reference/identifiers.html
[reference-implementations]: ../../reference/items/implementations.html
[reference-inherent-impl]: ../../reference/items/implementations.html#inherent-implementations
[reference-mutable-reference]: ../../reference/types/pointer.html#mutable-references-mut
[reference-paths]: ../../reference/paths.html
[reference-raw-pointer]: ../../reference/types/pointer.html#raw-pointers-const-and-mut
[reference-shared-reference]: ../../reference/types/pointer.html#shared-references-
[reference-slice]: ../../reference/types/slice.html
[reference-track_caller]: ../../reference/attributes/codegen.html#the-track_caller-attribute
[reference-trait-impl]: ../../reference/items/implementations.html#trait-implementations
[reference-trait-object]: ../../reference/types/trait-object.html
[reference-traits]: ../../reference/items/traits.html
[reference-tuple]: ../../reference/types/tuple.html
[reference-types]: ../../reference/types.html
